%---------------------------------------------------------------------------%
%->> Frontmatter
%---------------------------------------------------------------------------%
%-
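%-> Generate covers
%-
\maketitle% Generate the Chinese cover
\MAKETITLE% Generate the English cover
%-
%-> Author declaration
%-
% \makedeclaration% Generate the declaration page
%-
%-> Chinese abstract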
%-
\intobmk\chapter*{摘\quad 要}% Shown in bookmarks but not in the table of contents
\setcounter{page}{1}% Starting page number
\pagenumbering{Roman}% Page number style
本文主要探究了常见英文加数字型图形验证码的两类有效机器识别方法。

对于字符位置固定、无旋转、无扭曲变形的英文验证码，首先对验证码进行灰度化、去噪、切割等预处理流程，
然后人工抽取验证码字符的数字特征，之后将图像数据与验证码字符的数字特征进行对比，
获取各字符的相似度并进行排序来得到验证码识别结果，最终该方法达到了99\%的识别率。

对于字符随机偏移、旋转、扭曲变形的英文验证码，本文探索使用基于卷积神经网络的深度学习方法，
通过TensorFlow深度学习框架搭建基于CNN的端到端验证码识别模型，然后使用20000个验证码数据对模型进行3000轮左右的训练，
最终该方法在训练集上达到了100\%的识别率，在测试集上获得了97\%的识别率。

从识别率来看，这两种验证码的机器识别方法是有效的，甚至达到并超过了人眼的识别率。
其中，基于CNN的端到端验证码识别模型具有很强的泛化能力，只需更换验证码训练数据集，重新训练模型，即可解决大多数类型的验证码识别问题。

\keywords{数字特征，深度学习，卷积神经网络}% Chinese keywords
%-
%-> English abstract
%-
\intobmk\chapter*{Abstract}% Shown in bookmarks but not in the table of contents

This paper investigates two effective methods for the machine recognition of common image CAPTCHAs composed of English letters and digits.

For CAPTCHAs with fixed character positions and no rotation or distortion, we first preprocess each image through grayscale conversion, denoising, and character segmentation.
We then manually extract numerical features for each CAPTCHA character and compare the image data against these features.
The similarity to each candidate character is computed and ranked to obtain the recognition result. This method achieves a recognition rate of 99\%.

For CAPTCHAs whose characters are randomly offset, rotated, and distorted, we explore a deep learning approach based on convolutional neural networks (CNNs).
We build a CNN-based end-to-end CAPTCHA recognition model with the TensorFlow deep learning framework and train it on 20,000 CAPTCHA samples for about 3,000 rounds.
This method achieves a recognition rate of 100\% on the training set and 97\% on the test set.

Judging by these recognition rates, both methods are effective, and they match or even exceed the recognition rate of the human eye.
In particular, the CNN-based end-to-end CAPTCHA recognition model generalizes well: replacing the CAPTCHA training data set and retraining the model is enough to handle most types of CAPTCHA recognition problems.

\KEYWORDS{numerical features, deep learning, convolutional neural network}% English keywords
%---------------------------------------------------------------------------%
